Support vector regression model for BigData systems

Author

  • Alessandro Maria Rizzi
Abstract

Big Data is becoming increasingly important, and many sectors of the economy are now guided by data-driven decision processes. Big Data and business intelligence applications are facilitated by the MapReduce programming model, while at the infrastructure layer cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. In such systems, capacity allocation, that is, the ability to determine the minimal resources needed to achieve a given level of performance, is a key challenge in improving the performance of MapReduce jobs and minimizing cloud resource costs. One of the biggest challenges in doing so is building an accurate performance model to estimate the execution time of MapReduce jobs. Previous work applied simulation-based models to such systems. Although this approach can accurately describe the behavior of Big Data clusters, it is too computationally expensive and does not scale to large systems. We try to overcome these issues by applying machine learning techniques. More precisely, we focus on Support Vector Regression (SVR), which is intrinsically more robust than other techniques, such as neural networks, and less sensitive to outliers in the training set. To better investigate these benefits, we compare SVR to linear regression.
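The robustness claim in the abstract can be illustrated with the loss functions involved. The following is a minimal sketch, not the paper's experimental setup: it assumes the standard epsilon-insensitive loss for SVR and the squared loss for ordinary least-squares linear regression. The residual values, the `epsilon` setting, and the function names are hypothetical and chosen only for illustration.

```python
def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Standard SVR loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


def squared_loss(residual):
    """Loss minimized by ordinary least-squares linear regression."""
    return residual ** 2


# Hypothetical residuals (predicted minus measured job execution time).
# The last value plays the role of an outlier in the training set.
residuals = [0.05, -0.2, 0.5, 10.0]

svr_losses = [epsilon_insensitive_loss(r) for r in residuals]
ols_losses = [squared_loss(r) for r in residuals]

for r, s, o in zip(residuals, svr_losses, ols_losses):
    print(f"residual={r:6.2f}  epsilon-insensitive={s:7.3f}  squared={o:8.3f}")
```

Under these assumptions, the penalty on the outlier grows linearly with the residual for the epsilon-insensitive loss and quadratically for the squared loss, and residuals inside the epsilon tube contribute nothing at all. This is one common explanation for why SVR tends to be less sensitive to outliers than a least-squares fit.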

Similar resources

Support vector regression with random output variable and probabilistic constraints

Support Vector Regression (SVR) solves regression problems based on the concept of the Support Vector Machine (SVM). In this paper, a new SVR model with probabilistic constraints is proposed, in which the output data and the bias are treated as random variables with uniform probability distributions. Using the proposed method, the optimal regression hyperplane can be obtained by solving a quadrati...

Prediction of daily evaporation using hybrid support vector regression-firefly optimization algorithm and multilayer perceptron

Prediction of daily evaporation is a valuable and decisive tool in sustainable agriculture and hydrology, especially in the design and management of water resource systems. Therefore, in this study, the ability of the artificial intelligence models of multi-layer perceptron (MLP), support vector regression (SVR), and the hybrid model of support vector regression-firefly optimization a...

The Porosity Prediction of One of Iran South Oil Field Carbonate Reservoirs Using Support Vector Regression

Porosity is considered an important petrophysical parameter in characterizing reservoirs, calculating in-situ oil reserves, and evaluating production. Nowadays, intelligent techniques have become a popular means of porosity estimation. Support vector machine (SVM), a new intelligent method with great generalization potential for modeling non-linear relationships, has been introduced fo...

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. It is necessary because of the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

Journal:
  • CoRR

Volume: abs/1612.01458

Publication date: 2016